-
Notifications
You must be signed in to change notification settings - Fork 620
os/arch/arm/src/amebasmart: fix sem_timedwait(&g_read_cmd_done) not b… #6994
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
3ba2295 to
2b988ce
Compare
2b988ce to
ed10313
Compare
ed10313 to
313e0ab
Compare
313e0ab to
bdf7f99
Compare
seokhun-eom24
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about enabling the MIPI interrupt when the transfer starts and disabling it when it finishes?
In the current changes, the reason for calling sem_init in the Fail_case is to initialize the semaphore after a timedwait timeout so that it cannot be consumed later.
However, this still does not prevent a situation where an ISR occurs after the semaphore is initialized and performs a post.
Also, if we do it this way, I think send_cmd_done and receive_cmd_done become unnecessary, so please review this approach.
This PR addresses a potential issue that could arise on timeout; even if it isn’t merged quickly, I’d like to proceed in a way that fully resolves the problem.
Currently, when we switch to cmd mode to send cmd, we are enabling the interrupt, and disable it after switch to video mode. static void rtl8730e_mipi_mode_switch(mipi_mode_t mode)
{
if (mode == CMD_MODE) {
mipi_mode_switch_to_video(false);
MIPI_DSI_INT_Config(MIPI, DISABLE, ENABLE, FALSE);
LCDC_INTConfig(pLCDC, LCDC_BIT_DMA_UN_INTEN, DISABLE);
DelayMs(140);
} else {
MIPI_DSI_INT_Config(MIPI, DISABLE, DISABLE, FALSE);
LCDC_INTConfig(pLCDC, LCDC_BIT_DMA_UN_INTEN, ENABLE);
mipi_mode_switch_to_video(true);
}
}and it will be disable at the end of cmd sent if the msg->type == MIPI_DSI_END_OF_TRANSMISSION..
|
How about moving the interrupt enable/disable that operates when the mode changes to the point where the transfer occurs? Doing this would, as mentioned earlier, resolve issues such as an ISR firing after semaphore initialization, and allow us to match each transfer with its ISR one‑to‑one without using Please share your thoughts on this approach. |
For send cmd, i think it can be done since each send will correspond to a send cmd done isr. But for read cmd, it will trigger minimum two isr, when to disable the irq?
|
I think it should be disabled after reading the cmd ISR. This PR's final implementation goals are
|
Right now i think they are already paired with the help of 'send_cmd_done' and 'receive_cmd_done'. how about below code when timeout |
|
Hello @zhongnuo-tang Thank you for the suggestion. This three-step clearance process looks good. Additionally, I have one more suggestion. amebasmart_mipidsi_isr function like this. reg_val2 = MIPI_DSI_INTS_ACPU_Get(MIPIx);
MIPI_DSI_INTS_ACPU_Clr(MIPIx, reg_val2);
if (reg_val & MIPI_BIT_CMD_TXDONE) {
reg_val &= ~MIPI_BIT_CMD_TXDONE;
sem_post(&g_send_cmd_done);
}
if (reg_val & MIPI_BIT_RCMD1) {
amebasmart_mipidsi_rcmd_decode(MIPIx, 0);
rcmd_handled = true;
}
if (reg_val & MIPI_BIT_RCMD2) {
amebasmart_mipidsi_rcmd_decode(MIPIx, 1);
rcmd_handled = true;
}
if (reg_val & MIPI_BIT_RCMD3) {
amebasmart_mipidsi_rcmd_decode(MIPIx, 2);
rcmd_handled = true;
}
if (reg_val & (MIPI_BIT_RCMD1 | MIPI_BIT_RCMD2 | MIPI_BIT_RCMD3)){
if (rx_data_ptr) {
mipillvdbg("RCMD ");
for (int i = 0; i < rx_data_len; i ++) {
mipillvdbg("%x ", rx_data_ptr[i]);
}
mipillvdbg("\n" );
}
}
if (rcmd_handled) {
sem_post(&g_read_cmd_done);
}Please let me know if you have any concerns about this approach. |
| if (send_cmd_done == 0) { | ||
| sem_init(&g_send_cmd_done, 0, 0); | ||
| } | ||
| if (receive_cmd_done == 0) { | ||
| sem_init(&g_read_cmd_done, 0, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you initialize to 0 with this here, the value changes to 1 when IRQ appears.
Do we want this kind of situation?
This appears to be timeout operation handling.
Should we not mask IRQ even though timeout occurred?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should only get IRQ at the time we want it.
Send case, you can throw it and complete it automatically However, the next sending or tx fulled must be queued for tx complate. or send blocking(current)
Recive case, you must be queued or timed out.
If you don't want to wait for read blocking, you have to create flow using static read buffer here.
This BSP code case is send blocking and recive blocking.
If you blocking here, we must control mask mipi IRQ, so that can get IRQ between transfer fuction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, MIPI isr is enabled when we switch to cmd mode, and disable after switch to video mode OR at the end of cmd sent if the msg->type == MIPI_DSI_END_OF_TRANSMISSION when sending a cmd table. so it should be already handled.
For time out case, if we are using semaphore, it is very hard to handle it when timeout happen as we do not know when the timeout IRQ will fired.. I realized that even if we disable and clear the pending isr, it may not be sufficient..
Do you have any recommendation on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Controlling IRQ masking only at the beginning and end means that we have to handle incorrect packets and delayed RX packets.
The incorrect packets are already handled for mipi reset.
When we make an RX request, we send a command via TX. Therefore, the next RX request after a timeout could be for a different command, so the delayed (timed-out) RX packets should be ignored.
i think so, if we disable IRQ, the other cpu can already get IRQ.
The PR code is already notificating to ISR using rx_data_ptr with masking IRQ.
rx_data_ptr = NULL;
rx_data_len = 0;so, i suggest, decide whether if you handling delayed IRQ or not, using checking tx_data_ptr and rx_data_ptr with spin_lock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
like below.
spin_lock(&g_tx);
if (send_cmd_done == 0) {
send_cmd_done = 1;
sem_post(&g_send_cmd_done);
} else {
spin_unlock(&g_tx);
return;
}
spin_unlock(&g_tx);
...
spin_lock(&g_rx);
if (receive_cmd_done == 0) {
// handle rx;
sem_post(&g_read_cmd_done);
}
spin_unlock(&g_rx);
flags = irqsave();
...
ret = sem_timedwait(&g_send_cmd_done, &abstime);
spin_lock(&g_tx);
if (send_cmd_done == 0) {
send_cmd_done = 1;
receive_cmd_done = 1;
spin_unlock(&g_tx);
return -ETIMEOUT;
}
spin_unlock(&g_tx);
if (MIPI_LPTX_IS_READ(msg->type)) {
...
ret = sem_timedwait(&g_read_cmd_done, &abstime);
spin_lock(&g_rx);
if (receive_cmd_done == 0) {
receive_cmd_done = 1;
if (msg->rx_buf && msg->rx_len > 0 ) {
rx_data_ptr = NULL;
rx_data_len = 0;
}
ret = -ETIMEOUT;
}
spin_unlock(&g_rx);
}
...
irqrestore(flags); //init
send_cmd_done = 1;
receive_cmd_done = 1;
irq enable.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have tried with this implementation, with slight modification to enable cmd mode IRQ every time we call transfer. This is because the irq will be disabled if timeout happen in the ISR.
Tested with below
- detach the lcd, send mipi read, -> timeout
- attach back the lcd, call power_on
- run lcd_test stress_test start 2 and working OK.
I'm not sure if this could handle all the corner cases for SMP.. Please help to review thanks.
…eing called 1. change condition from receive_cmd_done flag to MIPI_LPTX_IS_READ, resolve the issue when rx isr is called earlier than sem_timedwait. 2. re-initialize semaphore when timeout happen.
1. rx_data_ptr is assigned from msg->rx_buf, it might be freed memory address or using memory for others. So it is not safety to clear in IRQ. 2. assign it to NULL instead
bdf7f99 to
91a5299
Compare
| if (send_cmd_done == 0) { | ||
| sem_init(&g_send_cmd_done, 0, 0); | ||
| } | ||
| if (receive_cmd_done == 0) { | ||
| sem_init(&g_read_cmd_done, 0, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Controlling IRQ masking only at the beginning and end means that we have to handle incorrect packets and delayed RX packets.
The incorrect packets are already handled for mipi reset.
When we make an RX request, we send a command via TX. Therefore, the next RX request after a timeout could be for a different command, so the delayed (timed-out) RX packets should be ignored.
i think so, if we disable IRQ, the other cpu can already get IRQ.
The PR code is already notificating to ISR using rx_data_ptr with masking IRQ.
rx_data_ptr = NULL;
rx_data_len = 0;so, i suggest, decide whether if you handling delayed IRQ or not, using checking tx_data_ptr and rx_data_ptr with spin_lock.
1. handle EINTR error in sem_timedwait.
| ret = OK; | ||
|
|
||
| cleanup: | ||
| leave_critical_section(flags); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think reset flow should be handled in one place.
How about in above two semaphore timeout case, just set return value to -ETIMEDOUT and reset receive_cmd_done, send_cmd_done and else in cleanup code.
And for use this approach, leave_critical_section(flags); should be after all cleanup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think current flow should be ok, we need to
leave_critical_section(flags) spin_unlock(&g_rtl8730e_config_dev_s_underflow); and bsp_pm_domain_control(BSP_MIPI_DRV, 0); in all cases even for timeout so that they are in pairs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First of all, I also think current flow is good.
But when sem_timedwait(&g_send_cmd_done, &abstime) timeout occurs and send_cmd_done = 0,
simple flow will be like below.
send_cmd_done = 1;
receive_cmd_done = 1;
rx_data_ptr = NULL;
rx_data_len = 0;
ret = -ETIMEDOUT;
goto cleanup;
cleanup:
leave_critical_section(flags);
send_cmd_done = 1;
receive_cmd_done = 1;
if (rx_data_rdy) {
rx_data_ptr = NULL;
rx_data_len = 0;
}
#ifdef CONFIG_SMP
spin_unlock(&g_rtl8730e_config_dev_s_underflow);
#endif
#ifdef CONFIG_PM
bsp_pm_domain_control(BSP_MIPI_DRV, 0);
#endif
return ret;
What I mean is clearing process runs twice. So how about remove clearing code in timeout handling code.
same for g_read_cmd_done timeout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are you referring to below code are running twice?
send_cmd_done = 1;
receive_cmd_done = 1;
rx_data_ptr = NULL;
rx_data_len = 0;
how about changing in below timeout case
if (send_cmd_done == 0) {
ret = -ETIMEDOUT;
goto cleanup;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. When goto cleanup, those variables will be cleared.
My question is do we need to clear those values twice.
Or is there any other corner case, when we remove clearing code in timeout case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think clearing it once should be sufficient.
I have updated again, could you check if the flow
dda8dae to
b1e2ed4
Compare
| spin_lock(&g_rx_lock); | ||
| receive_cmd_done = 0; | ||
| spin_unlock(&g_rx_lock); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This time, we have to disable IRQ.
the blow case is deadlock
- spi_lock(g_rx_lock)
- mipi IRQ come -> spin_lock(g_rx_lock)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i have changed the position of enter_critical_sectrion to before first spin_lock and and leave_critical_section after last spin_unlock in cleanup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excessive use of enter_critical_sectrion can affect performance.
enter_critical_sectrion is usually used for protect schedulring resources between cpu.
Could you please check that is irqsave enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- calling the transfer function at the same time is a competition between threads, so you have to use the sem.
- i think the competition between transfer and mipi irq is enough only with spinlock and irqsave.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm worried about SMP case, if we do not use enter_critical_section, it might cause deadlock
if cpu0 spinlock but cpu1 irq comes and call spin lock again like what you mentioned earlier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need to clear this case. enter_critical_section is not good solution.
- cpu0(thread) spinlock(send)
- cpu1(IRQ) spinlock(send)
-> cpu0 have to spinunlock(send), it's normal case
but
- cpu0(thread) spinlock(send)
- cpu1(IRQ) spinlock(recv)
- cpu1(IRQ) spinlock(send)
- cpu0(thread) spinlock(recv)
-> this case is deadlock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok in that case irq_save should be sufficient.
for you deadlock case, i think it should not be possible for
- cpu0(thread) spinlock(send)
- cpu1(IRQ) spinlock(recv)
- cpu1(IRQ) spinlock(send) --> can this happen if IRQ is same priority? i think there shouldn't be a nesting, please correct me if i'm wrong
- cpu0(thread) spinlock(recv) --> cpu1(IRQ) spinlock(recv) should be unlocked in the CPU1(IRQ)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tizen Lite don't support nest IRQ for genenal IRQ.
when IRQ is comming, firstly masking IRQ in os side
b1e2ed4 to
2788cea
Compare
1. when timeout happen, set flag to 1 so to ignore the delayed IRQ 2. enable cmd mode irq during transfer, as the cmd mode irq will be disabled if timeout happen, in such case it will be blocked by sema when retries
2788cea to
b6af268
Compare
…eing called